Chicago Bulls Prospective
Player Analysis '19-20
Chicago Bulls NBA 2019-2020 Prospective Player Report
1. Introduction:
This project applies the “Moneyball” approach of using statistical analysis to gain greater insight into sporting performance. Here it is applied to selecting players from the 2018-19 NBA season who would help the Chicago Bulls organisation improve on its previous season (13th in the Eastern Conference and 27th overall on win-loss ratio) in the upcoming 2019-20 NBA season.
The assigned task included the following:
* Assess potential players to purchase or retain for the Chicago Bulls organisation for the 2019-20 NBA season.
* Project expected results with the selected players.
* Select 5 players, one from each position (Centre, Power Forward, Small Forward, Shooting Guard, Point Guard).
* Ensure the purchase of the 5 players is within the allotted budget of $118 million.
* Leave enough budget to field the remaining players required for an NBA team (NBA teams are allowed 15 players in total).
The use of statistics in sport is not a new phenomenon (Noauthor_undated-pl?). People like Bill James and John Hollinger implemented and revolutionised the use of statistical analysis, which is now common within sports like basketball and in particular the North American basketball league, the NBA. John Hollinger created the all-in-one metric, the Player Efficiency Rating or PER, which combines several variables (both positive and negative outcomes, e.g. turnovers) into a single rating.
The hypothesis for this project is based on the use of a combination of known analysis methods to create a predictive equation to aid in the selection of appropriate players for the Chicago Bulls 19/20 season in the NBA.
The model variables were chosen on the premise that an increase in points per minute played is associated with an increase in overall Win%. The variables used for the predictive value were:
* Effective Field Goal Percentage (eFGp)
* Trade Value (TrV)
* Efficiency rate (EFF)
* Usage Rate (Tm_use)
* Total Rebounds per minute (TRB_MP)
* Points per minute (PTS_per_MP)
This section should provide relevant background information and justification for the project, including:
- relevant background information of basketball, including key metrics, position requirements etc
Key metrics:
* Minutes played
* Offensive value
* Defensive value (Rebounds)
* Offensive value vs Defensive value
* Assists
* Points
* Points per minute played
* Rebounds
* Offensive rebound %
* Defensive rebound %
* Turnovers
* Free throw attempts
* Free throw percentage
* Attempts in the paint
* Fouls
* Regular season vs post season
PER = Player Efficiency Rating
In basketball, effective field goal percentage (abbreviated eFG%) is a statistic that adjusts field goal percentage to account for the fact that three-point field goals count for three points while two-point field goals count for only two. Its goal is to show what field goal percentage a two-point shooter would have to shoot to match the output of a player who also shoots three-pointers (source: Wikipedia).
It is calculated by: \[ eFG(\%) = \frac{FG + (0.5 \times 3P)}{FGA} \]
where:
- FG = field goals made
- 3P = 3-point field goals made
- FGA = field goal attempts
A rough approximation can also be had by:
\[ eFG(\%) =\frac{\frac{PPG-FT}{2}}{FGA} \]
where:
- PPG = points per game
- FT = free throws made
- FGA = field goal attempts

The advantage of this second formula is that it highlights the logic behind the statistic: it pretends that a player only shot two-point shots (hence the division of non-free-throw points by 2).
An additional formula, which seems to be the one more often used for the statistics displayed on websites (but less often cited by them), is:
\[ eFG(\%) = \frac{2FG + (1.5 \times 3FG)}{FGA} \]
where:
- 2FG = 2-point field goals made
- 3FG = 3-point field goals made
- FGA = field goal attempts
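As a rough illustration of the eFG% definitions above, the following R sketch computes the statistic two ways: from total field goals made plus three-pointers made, and from the split of two-point and three-point makes. The function names and the shooting line are hypothetical, and the split form assumes FG = 2FG + 3FG, so that both versions should agree.

```r
# Effective field goal percentage from total makes and three-point makes.
# Function names are illustrative, not taken from the project scripts.
efg_from_totals <- function(fg, three_p, fga) {
  (fg + 0.5 * three_p) / fga
}

# Same quantity from the split of two-point and three-point makes,
# assuming FG = 2FG + 3FG.
efg_from_split <- function(two_fg, three_fg, fga) {
  (two_fg + 1.5 * three_fg) / fga
}

# Hypothetical shooting line: 8 makes on 16 attempts, 2 of the makes from three.
fg <- 8
three_p <- 2
fga <- 16

efg_a <- efg_from_totals(fg, three_p, fga)
efg_b <- efg_from_split(fg - three_p, three_p, fga)
print(c(totals = efg_a, split = efg_b))
```

Both calls return the same value for this made-up line, which is a quick check that the two formulas describe the same statistic.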
Usage rate:
Usage rate, also known as usage percentage, is an estimate of the percentage of team plays used by a player while he was on the floor.
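Usage Rate Formula
\[ USG(\%) = \frac{100 \times (FGA + 0.44 \times FTA + TOV) \times \text{Tm MP}}{(\text{Tm FGA} + 0.44 \times \text{Tm FTA} + \text{Tm TOV}) \times 5 \times MP} \]
where:
- FGA, FTA, TOV = the player’s field goal attempts, free throw attempts and turnovers
- MP = the player’s minutes
- Tm FGA, Tm FTA, Tm TOV = the team’s total field goal attempts, free throw attempts and turnovers
- Tm MP = the team’s total minutes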
By balancing usage rates and the varying offensive ratings of the five players on the court, a team can achieve optimal offensive output. The stats show that, for all players, as the player uses more possessions, his efficiency decreases. What defines a superstar, in Dean Oliver’s statistical analysis, is that he can shoulder a larger proportion of a team’s possessions with only a relatively small drop in efficiency. Meanwhile, the opposite is also true: Players perform more efficiently when they are asked to use fewer of their team’s possessions. As a result, the greater burden on the superstar means that supporting players maintain low usage rates, allowing them to operate closer to their peak efficiency.
In an effort to determine how much impact players have on their teams, sports statisticians have developed metrics such as Usage Percentage. Examining Usage Percentage gives us an indication of how efficient a player is given the number of possessions he uses.
What defines a quality player is someone who can have a high Usage Percentage, but still plays at a high rate of efficiency. Teams can look at the Usage Percentage of players on their team, and determine how to balance usage across their lineup to maximize team efficiency.
Although the formula itself looks a bit more complicated, the basic idea is to look at a player’s combination of field goal attempts, free throw attempts and turnovers, and find the percentage of the team totals he uses in those same categories.
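The usage rate calculation can be sketched in R as below. This is a minimal illustration only: the function name, argument names and box-score numbers are hypothetical, and the arithmetic follows the usage rate formula given in this section (player plays scaled by team minutes, divided by team plays times five times the player's minutes).

```r
# Usage rate as a percentage of team plays used while the player is on the floor.
# Argument names are illustrative, not taken from the project scripts.
usage_rate <- function(fga, fta, tov, mp, tm_fga, tm_fta, tm_tov, tm_mp) {
  player_plays <- fga + 0.44 * fta + tov
  team_plays <- tm_fga + 0.44 * tm_fta + tm_tov
  100 * player_plays * tm_mp / (team_plays * 5 * mp)
}

# Hypothetical single-game line for a high-usage player.
usg <- usage_rate(fga = 20, fta = 10, tov = 3, mp = 36,
                  tm_fga = 88, tm_fta = 25, tm_tov = 14, tm_mp = 240)

# Sanity check: a player who is on court all game and takes exactly
# one fifth of the team's plays should come out at 20%.
usg_even <- usage_rate(fga = 10, fta = 0, tov = 0, mp = 48,
                       tm_fga = 50, tm_fta = 0, tm_tov = 0, tm_mp = 240)
print(c(high_usage = usg, even_share = usg_even))
```

The even-share case is a useful check when implementing the formula, since five players sharing plays equally should each sit at 20%.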
Some of the all-time leaders in this category include Michael Jordan, Allen Iverson, George Gervin, Dominique Wilkins and Shaquille O’Neal.
(usage-percentage-ref-pl?)
2. Report scenario:
This report forms the tangible component of a reproducible data analysis project for a task given to the data analytics team by the Chicago Bulls GM. The task was to assess potential players to join or be retained by the Chicago Bulls organisation for the 2019-20 NBA season.
The projected budget for player contracts for the 2019-20 season is $118 million.
The aim of the project
This report and analysis aims to provide five starting players (PG, C, SG, PF & SF) of the highest value based upon a cost-benefit analysis. The purchase of the proposed athletes/players still allows sufficient budget to complete the remaining roster.
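The budget constraint described here can be checked with a few lines of R. The sketch below is only an illustration: the five salaries are made-up placeholder values, not recommendations from this analysis, while the $118 million budget and the 15-player roster size come from the task description.

```r
budget_total <- 118e6  # player contract budget from the task brief
roster_size <- 15      # NBA teams are allowed 15 players in total

# Hypothetical salaries for the five starters (illustrative values only).
starters <- data.frame(
  Pos = c("PG", "SG", "SF", "PF", "C"),
  salary = c(20e6, 15e6, 18e6, 12e6, 25e6)
)

spent <- sum(starters$salary)
remaining <- budget_total - spent
remaining_slots <- roster_size - nrow(starters)

# Average amount left for each of the remaining roster spots.
per_remaining_slot <- remaining / remaining_slots
within_budget <- spent <= budget_total

print(data.frame(spent, remaining, remaining_slots, per_remaining_slot, within_budget))
```

Whether the amount left per remaining roster spot is sufficient depends on salary rules that are outside this sketch.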
Justification and importance
The previous 2018-19 season saw the Chicago Bulls finish 27th out of 30 teams in the NBA (on win-loss record). The Chicago Bulls organisation has aspirations to rebuild their line-up and field a team with championship title potential for the upcoming 2019-20 season.
2. Reading and cleaning the raw data:
This section should document the process used to read and clean the raw data. It should also include a description of the data sets used and variables in each. For brevity, you could provide a link to the specific variable descriptions, rather than writing these out in full within your report.
How to use this project:
This project was designed and built in RStudio, Version 1.4.1103, © 2009-2021 RStudio, PBC
This Readme file and GitHub repo covers the following:
- File locations
- Operational order
- Data sources
- Raw data
- Tidy data
- Glossary of definitions and abbreviations used for each data base
- Relevant calculations
File Locations
- Rmarkdown -> Chicago_Bulls_u125511
- Data:
* Project-data
* Tidy_data
- Funcs -> Rscript files (in descending operational order)
* Bulls_Fresh_start.R
* Teams_wins_loss_df.R
* Exploratory_Analysis_2.R
* Bulls_multi_reg.R
- Figures
- Images
- References
Operational order:
Scripts to be loaded and run in RStudio, in this order:
1. Bulls_Project_Fresh_start.R
2. Team_wins_loss_df.R
3. Exploratory_Analysis_2.R
4. Bulls_Multi_reg.R
Produced "*.csv" data frames
The following .csv files will be exported locally into the data/tidy_data folder:
* player_stats_tidy.csv
* df_team_Stats1.csv
* df_team_Stats2.csv
* df_TOT_players.csv
* df_nonTOT_clean.csv
* datC.csv
* datPF.csv
* datPG.csv
* datSF.csv
* datSG.csv
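A minimal sketch of how a tidy data frame might be exported to, and read back from, a data/tidy_data folder is shown below. It uses base R only and writes under a temporary directory so that it can be run anywhere; the data frame contents are invented, and the project scripts may use different functions or paths.

```r
# Write into a temporary copy of the folder layout (assumed: data/tidy_data).
tidy_dir <- file.path(tempdir(), "data", "tidy_data")
dir.create(tidy_dir, recursive = TRUE, showWarnings = FALSE)

# Hypothetical tidy player table (values are illustrative only).
player_stats_tidy <- data.frame(
  player_name = c("Player A", "Player B"),
  Pos = c("C", "PG"),
  MP = c(1800, 2100),
  PTS = c(900, 1300)
)

out_path <- file.path(tidy_dir, "player_stats_tidy.csv")
write.csv(player_stats_tidy, out_path, row.names = FALSE)

# Read the file back to confirm the export round-trips.
reloaded <- read.csv(out_path)
print(reloaded)
```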
Data sources
Glossary:
NBA standard terms:
* Glossary of NBA Statistics
* Basketball Positions
Project specific:
* Pos = Position
* Tm = Team, abbreviated to three letters, e.g. Chicago = CHI, Houston = HOU etc.
* ‘…MP’ = Statistic at a per minute rate
* ‘TM…’ = Statistic as a team total
* Tm_use_total = Total use by the team as a percentage across the total number of minutes played
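To illustrate the per-minute naming convention, the sketch below derives PTS_per_MP and TRB_MP by dividing season totals by minutes played. The column names PTS, TRB and MP for the raw totals are assumptions, and the rows are invented.

```r
# Hypothetical season totals for two players (illustrative values only).
players <- data.frame(
  player_name = c("Player A", "Player B"),
  Tm = c("CHI", "HOU"),
  MP = c(1800, 2400),   # minutes played
  PTS = c(900, 1500),   # total points
  TRB = c(540, 300)     # total rebounds
)

# Per-minute rates: season total divided by minutes played.
players$PTS_per_MP <- players$PTS / players$MP
players$TRB_MP <- players$TRB / players$MP

print(players[, c("player_name", "Tm", "PTS_per_MP", "TRB_MP")])
```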
3. Exploratory analysis:
[Figure: NBA player group (linear)]
This section should document your exploratory data analysis and may include but is not limited to:
a) checking for errors and missing values within the data sets
b) checking the distribution of variables
c) checking for relationships between variables, or differences between groups
d) justification for decisions made about data modelling
Note that this section and the data cleaning section may be an iterative process, as you might find things about the data that need to be ‘cleaned up’ once you have explored the data further.
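The checks listed in this section could be started along the following lines. This is a sketch on a small invented data frame, using base R only; the column names are assumptions and the real analysis would run on the tidy player data.

```r
# Hypothetical player data with one missing value (illustrative only).
players <- data.frame(
  MP = c(1800, 2400, NA, 1200),
  PTS = c(900, 1500, 400, 480)
)

# a) Missing values per column.
missing_per_column <- colSums(is.na(players))

# Keep only complete rows for the remaining checks.
complete <- players[complete.cases(players), ]

# b) Distribution of a variable.
pts_summary <- summary(complete$PTS)

# c) Relationship between two variables.
mp_pts_cor <- cor(complete$MP, complete$PTS)

print(missing_per_column)
print(pts_summary)
print(mp_pts_cor)
```

A histogram or scatter plot of the same columns would be the natural next step before deciding on a model form.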
4. Data modelling and results:
This section may include but is not limited to:
- data modeling (e.g. creating a linear regression)
need to check source
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | -0.3821332 | 0.0180836 | -21.1315032 | 0.0000000 | -0.4177259 | -0.3465405 |
| eFGp | 0.6985608 | 0.0314831 | 22.1884530 | 0.0000000 | 0.6365947 | 0.7605269 |
| TRB_MP | -0.0329833 | 0.0174981 | -1.8849604 | 0.0604419 | -0.0674237 | 0.0014572 |
| Tm_use_total | 2.3855803 | 0.0366744 | 65.0474831 | 0.0000000 | 2.3133963 | 2.4577642 |
| EFF | 0.0000096 | 0.0000043 | 2.2488204 | 0.0252797 | 0.0000012 | 0.0000181 |
| TrV | -0.0000080 | 0.0000104 | -0.7740553 | 0.4395330 | -0.0000284 | 0.0000124 |
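A table with these columns can be produced from a fitted linear model. The sketch below fits a multiple regression on simulated data and assembles the estimates, standard errors, test statistics, p-values and 95% confidence limits with base R. The simulated values, the coefficients used to generate them, and the choice of PTS_per_MP as the response are all assumptions made for illustration; the column names in the table above appear to match those returned by `broom::tidy(fit, conf.int = TRUE)`, which may be what the project used.

```r
set.seed(42)
n <- 200

# Simulated predictors (ranges are illustrative, not taken from the NBA data).
sim <- data.frame(
  eFGp = runif(n, 0.4, 0.6),
  TRB_MP = runif(n, 0.05, 0.35),
  Tm_use_total = runif(n, 0.1, 0.3),
  EFF = runif(n, 200, 2000),
  TrV = runif(n, 100, 900)
)

# Simulated response: assumed to be points per minute played.
sim$PTS_per_MP <- -0.38 + 0.70 * sim$eFGp + 2.4 * sim$Tm_use_total +
  rnorm(n, sd = 0.02)

fit <- lm(PTS_per_MP ~ eFGp + TRB_MP + Tm_use_total + EFF + TrV, data = sim)

# Coefficient table with 95% confidence intervals, built with base R.
coef_table <- cbind(coef(summary(fit)), confint(fit, level = 0.95))
colnames(coef_table) <- c("estimate", "std.error", "statistic",
                          "p.value", "conf.low", "conf.high")
print(round(coef_table, 4))
```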
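Predictive formula based on the multiple regression model, evaluated for an example player with the following values:
- eFGp = 0.55
- TRB_MP = 0.2
- Tm_use_total = 0.2
- EFF = 1500
- TrV = 600
\[ \widehat{PTS\_per\_MP} = -0.382 + 0.699 \times 0.55 - 0.0330 \times 0.2 + 2.39 \times 0.20 + 0.00000965 \times 1500 - 0.00000803 \times 600 \]
This evaluates to 0.483507 expected points per minute played (exp_PTS_per_MP).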
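The worked example above can be reproduced in R. This sketch hard-codes the rounded coefficients and the example player values from this section rather than reading them from the fitted model, and it assumes the result is the expected points per minute played.

```r
# Rounded coefficients from the regression table in this section.
coefs <- c(
  "(Intercept)" = -0.382,
  eFGp = 0.699,
  TRB_MP = -0.0330,
  Tm_use_total = 2.39,
  EFF = 0.00000965,
  TrV = -0.00000803
)

# Example player values used in the worked formula.
player <- c(eFGp = 0.55, TRB_MP = 0.2, Tm_use_total = 0.2, EFF = 1500, TrV = 600)

# Intercept plus the sum of coefficient * value, matched by name.
exp_PTS_per_MP <- coefs[["(Intercept)"]] + sum(coefs[names(player)] * player)
print(round(exp_PTS_per_MP, 6))  # 0.483507
```

With a fitted model object available, `predict()` on a one-row data frame would do the same job without rounding the coefficients.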
assumption checking
model output and interpretation of your model
5. Player Analysis:
Points per/min vs Salary:
Player vs Salary analysis:
Trade Value vs Salary:
6. Player recommendations:
This section will be the key part that is presented to the general manager. Here you should present your recommendations for the best five starting players, but also think about what other important information they would want to know, and how it is best to present that information to them.
7. Summary:
Provide a brief summary which describes the key points and findings from your project. It will also be important to acknowledge any limitations of your model and overall approach to answering the question asked of you by the general manager.
- Reference List
Provide a reference list of any sources you used in the development of your report and justification of your arguments. Please use the Vancouver reference style for the reference list and in-text references.
To put multiple plots in a single row, set out.width to 50% for two plots, 33% for three plots, or 25% for four plots, and set fig.align = "default". Depending on what is being illustrated (e.g. showing data or showing plot variations), fig.width may also need tweaking, as discussed below.
If you have to squint to read the text in a plot, tweak fig.width. If fig.width is larger than the size at which the figure is rendered in the final document, the text will be too small; if fig.width is smaller, the text will be too big.
Setting fig.show = "hold" and fig.cap = "" gives the figure a caption and makes it floating rather than inline.
Data frames may need to be re-added between steps. If so, using cache = TRUE with dependson = "…." allows that. Chunk {r figure, include=FALSE}: fig.width = 6 (6") and fig.asp = 0.618, out.width = "70%" and fig.align = "center".
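```{r Table1_test, echo=FALSE}
library(dplyr)

# Initial table: top 20 players ranked by expected points per minute,
# with salary used to order players that have the same expected value.
model_testing %>%
  select(player_name, Tm, Age, Pos, salary, TrV, EFF, Tm_use_total,
         PTS_per_MP, TRB_MP, exp_PTS_per_MP) %>%
  arrange(desc(exp_PTS_per_MP), salary) %>%
  slice_head(n = 20) %>%
  knitr::kable(caption = "Top 20 player selections.")
```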